AITopics | control module

20e6b4dd2b1f82bc599c593882f67f75-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 08:25:31 GMT

arxiv preprint arxiv, editing, video, (14 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

CusEnhancer: A Zero-Shot Scene and Controllability Enhancement Method for Photo Customization via ResInversion

Ren, Maoye, Vaddamanu, Praneetha, Xu, Jianjin, Frade, Fernando De la Torre

arXiv.org Artificial Intelligence | Sep-26-2025

Recently, remarkable progress has been made in synthesizing realistic human photos using text-to-image diffusion models. However, current approaches face degraded scenes, insufficient control, and suboptimal perceptual identity. We introduce CustomEnhancer, a novel framework to augment existing identity customization models. CustomEnhancer is a zero-shot enhancement pipeline that leverages face-swapping techniques and a pretrained diffusion model to obtain additional representations in a zero-shot manner for encoding into personalized models. Through our proposed triple-flow fused PerGeneration approach, which identifies and combines two compatible counter-directional latent spaces to manipulate a pivotal space of the personalized model, we unify the generation and reconstruction processes, realizing generation from three flows. Our pipeline also enables comprehensive training-free control over the generation process of personalized models, offering precisely controlled personalization and eliminating the need for per-model controller retraining. In addition, to address the high time complexity of null-text inversion (NTI), we introduce ResInversion, a novel inversion method that performs noise rectification via a pre-diffusion mechanism, reducing the inversion time by a factor of 129. Experiments demonstrate that CustomEnhancer reaches SOTA results in scene diversity, identity fidelity, and training-free control, while also showing the efficiency of our ResInversion over NTI. The code will be made publicly available upon paper acceptance.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.20775

Country:

North America > United States > Pennsylvania (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.46)

Cross-Modality Controlled Molecule Generation with Diffusion Language Model

Zhang, Yunzhe, Wang, Yifei, Nguyen, Khanh Vinh, Hong, Pengyu

arXiv.org Artificial Intelligence | Aug-21-2025

They inject conditioning signals at the start of the training process and require retraining a new model from scratch whenever the constraint changes. However, real-world applications often involve multiple constraints across different modalities, and additional constraints may emerge over the course of a study. This raises a challenge: how to extend a pre-trained diffusion model not only to support cross-modality constraints but also to incorporate new ones without retraining. To tackle this problem, we propose the Cross-Modality Controlled Molecule Generation with Diffusion Language Model (CMCM-DLM), demonstrated on two distinct modalities: molecular structure and chemical properties. Our approach builds upon a pre-trained diffusion model, incorporating two trainable modules, the Structure Control Module (SCM) and the Property Control Module (PCM), and operates in two distinct phases during the generation process. In Phase I, we employ the SCM to inject structural constraints during the early diffusion steps, effectively anchoring the molecular backbone. Phase II builds on this by introducing the PCM to guide the later stages of inference and refine the generated molecules, ensuring that their chemical properties match the specified targets. Experimental results on multiple datasets demonstrate the efficiency and adaptability of our approach, highlighting CMCM-DLM's significant advancement in molecular generation for drug discovery applications.
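
The two-phase schedule in this abstract can be pictured as a switch indexed by the denoising step. The following is only an illustrative sketch, not the authors' implementation: the function name `active_modules`, the `switch_fraction` parameter and its default of 0.5 are assumptions, since the abstract does not say where Phase I ends. It also assumes the SCM stays active in Phase II, which is one reading of "builds on this".

```python
def active_modules(step, total_steps, switch_fraction=0.5):
    """Return the control modules assumed active at a given denoising step.

    step counts from 0 (first, noisiest step) to total_steps - 1.
    switch_fraction is a hypothetical knob: the share of steps spent in Phase I.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    phase_one_steps = int(total_steps * switch_fraction)
    if step < phase_one_steps:
        return ("SCM",)        # Phase I: structural constraints only
    return ("SCM", "PCM")      # Phase II: property guidance added on top (assumed)


# Which modules would be consulted at each of 10 denoising steps.
schedule = [active_modules(t, 10) for t in range(10)]
```

Because the switch depends only on the step index, a new constraint module could in principle be added to the returned tuple without touching the pre-trained backbone, which is the property the abstract emphasises.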

constraint, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.14748

Country:

Oceania > Samoa (0.04)
North America > United States > Massachusetts > Middlesex County > Waltham (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Voice Impression Control in Zero-Shot TTS

Fujita, Keinichi, Horiguchi, Shota, Ijima, Yusuke

arXiv.org Artificial Intelligence | Jun-11-2025

Para-/non-linguistic information in speech is pivotal in shaping the listeners' impression. Although zero-shot text-to-speech (TTS) has achieved high speaker fidelity, modulating subtle para-/non-linguistic information to control perceived voice characteristics, i.e., impressions, remains challenging. We have therefore developed a voice impression control method in zero-shot TTS that utilizes a low-dimensional vector to represent the intensities of various voice impression pairs (e.g., dark-bright). The results of both objective and subjective evaluations have demonstrated our method's effectiveness in impression control. Furthermore, generating this vector via a large language model enables target-impression generation from a natural language description of the desired impression, thus eliminating the need for manual optimization.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.05688

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Design and Implementation of a Peer-to-Peer Communication, Modular and Decentral YellowCube UUV

Xu, Zhizun, Jia, Baozhu, Shi, Weichao

arXiv.org Artificial Intelligence | Jun-10-2025

Unmanned Underwater Vehicles (UUVs) are pivotal tools for offshore engineering and oceanographic research. Most existing UUVs do not facilitate easy integration of new or upgraded sensors. A solution to this problem is a modular UUV system with changeable payload sections capable of carrying different sensors to suit different missions. The design and implementation of a modular and decentral UUV named YellowCube is presented in this paper. Instead of the centralised software architecture adopted by other modular underwater vehicle designs, a Peer-to-Peer (P2P) communication mechanism is implemented among the UUV's modules. Laboratory experiments and sea trials have been carried out to verify the performance of the UUV. Over the past few decades, UUVs have become essential tools in offshore engineering and ocean research. Their tasks range from offshore engineering, oceanographic research, and salvage and rescue to military monitoring.

artificial intelligence, machine learning, module, (15 more...)

arXiv.org Artificial Intelligence

2506.07924

Country:

Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)
Pacific Ocean > North Pacific Ocean > South China Sea (0.04)
Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Energy (0.94)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Robots (0.98)
Information Technology > Communications (0.70)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

SGN-CIRL: Scene Graph-based Navigation with Curriculum, Imitation, and Reinforcement Learning

Oskolkov, Nikita, Zhang, Huzhenyu, Makarov, Dmitry, Yudin, Dmitry, Panov, Aleksandr

arXiv.org Artificial Intelligence | Jun-6-2025

The 3D scene graph models spatial relationships between objects, enabling the agent to navigate efficiently in a partially observable environment and predict the location of the target object. This paper proposes an original framework named SGN-CIRL (3D Scene Graph-Based Reinforcement Learning Navigation) for mapless reinforcement learning-based robot navigation with a learnable representation of an open-vocabulary 3D scene graph. To accelerate and stabilize the training of reinforcement learning-based algorithms, the framework also employs imitation learning and curriculum learning. The former enables the agent to learn from demonstrations, while the latter structures the training process by gradually increasing task complexity from simple to more advanced scenarios. Numerical experiments conducted in the Isaac Sim environment showed that using a 3D scene graph for reinforcement learning significantly increased the success rate in difficult navigation cases. The code is open-sourced and available at: https://github.com/Xisonik/Aloha
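
Curriculum learning of the kind mentioned in this abstract is often driven by the agent's recent success rate. The sketch below shows that generic pattern only; it is not taken from the SGN-CIRL code. The class name `Curriculum`, the promotion threshold `promote_at`, and the `window` size are all hypothetical.

```python
class Curriculum:
    """Promote the agent to a harder level once it succeeds often enough."""

    def __init__(self, num_levels, promote_at=0.8, window=20):
        self.num_levels = num_levels    # number of difficulty levels
        self.promote_at = promote_at    # assumed success-rate threshold
        self.window = window            # assumed number of recent episodes to average
        self.level = 0                  # start with the simplest scenarios
        self.results = []

    def record(self, success):
        """Log one episode outcome and return the level for the next episode."""
        self.results.append(bool(success))
        recent = self.results[-self.window:]
        ready = (
            len(recent) == self.window
            and sum(recent) / self.window >= self.promote_at
        )
        if ready and self.level < self.num_levels - 1:
            self.level += 1
            self.results.clear()        # measure the new level from scratch
        return self.level


curriculum = Curriculum(num_levels=3, promote_at=0.8, window=5)
levels = [curriculum.record(True) for _ in range(12)]
```

Clearing the history on promotion keeps successes on an easier level from counting toward the next promotion.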

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2506.04505

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Asia > Russia (0.05)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Neural Network Mode for PX4 on Embedded Flight Controllers

Hegre, Sindre M., Rehberg, Welf, Kulkarni, Mihir, Alexis, Kostas

arXiv.org Artificial Intelligence | May-2-2025

This paper contributes an open-sourced implementation of a neural-network-based controller framework within the PX4 stack. We develop a custom module for inference on the microcontroller while retaining all of the functionality of the PX4 autopilot. Policies trained in the Aerial Gym Simulator are converted to the TensorFlow Lite format and then built together with PX4 and flashed to the flight controller. The policies substitute the control cascade within PX4 to offer an end-to-end position-setpoint tracking controller that directly provides normalized motor RPM setpoints. Experiments conducted in simulation and in the real world show similar tracking performance. We thus provide a flight-ready pipeline for testing neural control policies in the real world. The pipeline simplifies the deployment of neural networks on embedded flight controller hardware, thereby accelerating research on learning-based control. Both the Aerial Gym Simulator and the PX4 module are open-sourced at https://github.com/ntnu-arl/aerial_gym_simulator and https://github.com/SindreMHegre/PX4-Autopilot-public/tree/for_paper. Video: https://youtu.be/lY1OKz_UOqM?si=VtzL243BAY3lblTJ.

artificial intelligence, controller, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2505.00432

Country:

North America > United States (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)

Genre: Research Report (0.52)

Industry: Transportation > Air (0.71)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis

Zhou, Xukun, Li, Fengxin, Chen, Ming, Zhou, Yan, Wan, Pengfei, Zhang, Di, Jin, Yeying, Fan, Zhaoxin, Liu, Hongyan, He, Jun

arXiv.org Artificial Intelligence | Mar-15-2025

Audio-driven human gesture synthesis is a crucial task with broad applications in virtual avatars, human-computer interaction, and creative content generation. Despite notable progress, existing methods often produce gestures that are coarse, lack expressiveness, and fail to fully align with audio semantics. To address these challenges, we propose ExGes, a novel retrieval-enhanced diffusion framework with three key designs: (1) Motion Base Construction, which builds a gesture library from the training dataset; (2) a Motion Retrieval Module, which employs contrastive learning and momentum distillation for fine-grained retrieval of reference poses; and (3) a Precision Control Module, which integrates partial masking and stochastic masking to enable flexible and fine-grained control. Experimental evaluations on BEAT2 demonstrate that ExGes reduces Fréchet Gesture Distance by 6.2% and improves motion diversity by 5.3% over EMAGE, with user studies revealing a 71.3% preference for its naturalness and semantic relevance. Code will be released upon acceptance.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.06499

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

HALO: Fault-Tolerant Safety Architecture For High-Speed Autonomous Racing

Harder, Aron, Kulkarni, Amar, Behl, Madhur

arXiv.org Artificial Intelligence | Mar-13-2025

The field of high-speed autonomous racing has seen significant advances in recent years, with the rise of competitions such as RoboRace and the Indy Autonomous Challenge providing a platform for researchers to develop software stacks for autonomous race vehicles capable of reaching speeds in excess of 170 mph. Ensuring the safety of these vehicles requires the software to continuously monitor for different faults and erroneous operating conditions during high-speed operation, with the goal of mitigating any unreasonable risks posed by malfunctions in sub-systems and components. This paper presents a comprehensive overview of the HALO safety architecture, which has been implemented on a full-scale autonomous racing vehicle as part of the Indy Autonomous Challenge. The paper begins with a failure mode and criticality analysis of the perception, planning, control, and communication modules of the software stack. Specifically, we examine three different types of faults - node health, data health, and behavioral-safety faults. To mitigate these faults, the paper then outlines HALO safety archetypes and runtime monitoring methods. Finally, the paper demonstrates the effectiveness of the HALO safety architecture for each of the faults, through real-world data gathered from autonomous racing vehicle trials during multi-agent scenarios.
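
The abstract names node health as one fault class but does not describe how it is detected. A common generic technique for this kind of check is a heartbeat timeout, sketched below purely as an illustration: the class `HeartbeatMonitor`, its methods, the node names, and the 0.5 s timeout are assumptions and are not taken from the HALO architecture.

```python
class HeartbeatMonitor:
    """Flag nodes whose last heartbeat is older than a timeout."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # assumed maximum allowed silence, in seconds
        self.last_seen = {}          # node name -> time of last heartbeat

    def beat(self, node, now):
        """Record a heartbeat from `node` at time `now` (seconds)."""
        self.last_seen[node] = now

    def stale_nodes(self, now):
        """Return the sorted names of nodes that have been silent too long."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=0.5)
monitor.beat("planner", now=0.0)     # hypothetical node names
monitor.beat("control", now=0.4)
stale_early = monitor.stale_nodes(now=0.6)
stale_late = monitor.stale_nodes(now=1.0)
```

Timestamps are passed in explicitly rather than read from a clock so that the check is deterministic and easy to replay against logged data.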

module, node, vehicle, (13 more...)

arXiv.org Artificial Intelligence

2503.10341

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Virginia (0.04)
North America > United States > Indiana > Marion County > Indianapolis (0.04)
(3 more...)

Genre:

Overview (0.68)
Research Report (0.50)

Industry:

Leisure & Entertainment > Sports > Motorsports (1.00)
Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.93)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
(3 more...)
